
    Searching for a talking face: the effect of degrading the auditory signal

    Previous research (e.g. McGurk and MacDonald, 1976) suggests that faces and voices are bound automatically, but recent evidence indicates that attention is involved in the task of searching for a talking face (Alsius and Soto-Faraco, 2011). We hypothesised that the processing demands of the stimuli may affect the amount of attentional resources required, and investigated what effect degrading the auditory stimulus had on the time taken to locate a talking face. Twenty participants were presented with between 2 and 4 faces articulating different sentences, and had to decide which of these faces matched the sentence that they heard. The results showed that in the least demanding auditory condition (clear speech in quiet), search times did not significantly increase when the number of faces increased. However, when speech was presented in background noise or was processed to simulate the information provided by a cochlear implant, search times increased as the number of faces increased. Thus, it seems that the amount of attentional resources required varies according to the processing demands of the auditory stimuli, and when processing load increases, faces must be attended to individually in order to complete the task. Based on these results we would expect cochlear-implant users to find the task of locating a talking face more attentionally demanding than normal-hearing listeners.
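
    The abstract does not specify how the cochlear-implant simulation was implemented, but a standard approach in this literature is noise vocoding: the speech is split into a small number of frequency bands, the amplitude envelope of each band is extracted, and the envelopes modulate matching bands of noise, discarding temporal fine structure. A minimal Python sketch under that assumption (the channel count, frequency range, and filter settings are illustrative, not the study's parameters):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
    """Noise-vocode a speech signal to approximate cochlear-implant input.

    Bands are spaced logarithmically between f_lo and f_hi (f_hi must stay
    below fs/2). Each band's Hilbert envelope modulates a matching band of
    white noise. Parameter values are illustrative assumptions only.
    """
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    carrier = np.random.randn(len(speech))      # broadband noise carrier
    out = np.zeros(len(speech), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))        # amplitude envelope of band
        noise_band = sosfiltfilt(sos, carrier)  # noise limited to same band
        out += envelope * noise_band
    return out / np.max(np.abs(out))            # normalise peak level
```

    Fewer channels discard more spectral detail and make the simulation harder to understand; studies of this kind typically use somewhere between 4 and 16 channels.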

    Visual speech benefit in clear and degraded speech depends on the auditory intelligibility of the talker and the number of background talkers

    Perceiving speech in background noise presents a significant challenge to listeners. Intelligibility can be improved by seeing the face of a talker. This is of particular value to hearing-impaired people and users of cochlear implants. It is well known that auditory-only speech understanding depends on factors beyond audibility. How these factors affect the audio-visual integration of speech is poorly understood. We investigated audio-visual integration when either the interfering background speech (Experiment 1) or the intelligibility of the target talkers (Experiment 2) was manipulated. Clear speech was also contrasted with sine-wave vocoded speech to mimic the loss of temporal fine structure with a cochlear implant. Experiment 1 showed that for clear speech, the visual speech benefit was unaffected by the number of background talkers. For vocoded speech, a larger benefit was found when there was only one background talker. Experiment 2 showed that visual speech benefit depended upon the audio intelligibility of the talker and increased as intelligibility decreased. Degrading the speech by vocoding resulted in even greater benefit from visual speech information. A single “independent noise” signal detection theory model predicted the overall visual speech benefit in some conditions but could not predict the different levels of benefit across variations in the background or target talkers. This suggests that, similar to audio-only speech intelligibility, the integration of audio-visual speech cues may be functionally dependent on factors other than audibility and task difficulty, and that clinicians and researchers should carefully consider the characteristics of their stimuli when assessing audio-visual integration.
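
    The abstract does not give the model's equations, but a common "independent noise" signal detection formulation assumes the auditory and visual channels are corrupted by independent Gaussian noise, so an ideal combiner's audio-visual sensitivity is the quadratic sum of the unimodal sensitivities. A hedged sketch under that assumption (the function name and example d′ values are hypothetical, not taken from the paper):

```python
import numpy as np

def predicted_av_dprime(d_a, d_v):
    """Predicted audio-visual d' if the auditory and visual channels carry
    independent Gaussian noise and are combined optimally. This is one
    common 'independent noise' formulation; the paper's exact model may
    differ."""
    return np.hypot(d_a, d_v)  # sqrt(d_a**2 + d_v**2)

# Illustrative values: degraded audio (d'=1.0), moderate lip-reading (d'=1.5).
print(predicted_av_dprime(1.0, 1.5))  # ~1.80, larger than either cue alone
```

    Under this formulation the predicted benefit depends only on the unimodal sensitivities, which is why a single such model cannot capture benefit that varies with the background or target talkers at matched unimodal performance.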

    Cueing listeners to attend to a target talker progressively improves word report as the duration of the cue-target interval lengthens to 2000 ms

    Endogenous attention is typically studied by presenting instructive cues in advance of a target stimulus array. For endogenous visual attention, task performance improves as the duration of the cue-target interval increases up to 800 ms. Less is known about how endogenous auditory attention unfolds over time or the mechanisms by which an instructive cue presented in advance of an auditory array improves performance. The current experiment used five cue-target intervals (0, 250, 500, 1000, and 2000 ms) to compare four hypotheses for how preparatory attention develops over time in a multi-talker listening task. Young adults were cued to attend to a target talker who spoke in a mixture of three talkers. Visual cues indicated the target talker’s spatial location or their gender. Participants directed attention to location and gender simultaneously (‘objects’) at all cue-target intervals. Participants were consistently faster and more accurate at reporting words spoken by the target talker when the cue-target interval was 2000 ms than when it was 0 ms. In addition, the latency of correct responses progressively shortened as the duration of the cue-target interval increased from 0 to 2000 ms. These findings suggest that the mechanisms involved in preparatory auditory attention develop gradually over time, taking at least 2000 ms to reach optimal configuration, yet providing cumulative improvements in speech intelligibility as the duration of the cue-target interval increases from 0 to 2000 ms. These results demonstrate an improvement in performance for cue-target intervals longer than those that have been reported previously in the visual or auditory modalities.